Prosodically modifying speech for unit selection speech synthesis databases
Abstract
This paper investigates the practical limits of artificially increasing the prosodic richness of a unit selection database by transforming the prosodic realization of constituent sentences. The resulting high-quality transformed sentences are added to the database as new material. We examine in detail one of the most challenging prosodic transformations, namely converting statements into yes/no questions. Such transformations can require very large prosodic modifications, yet the signal must retain as much of its naturalness as possible. Our data-driven approach learns template pitch contours for different stress patterns of interrogative sentences from training data and then applies these templates to unseen statements to generate the corresponding questions. We examine experimentally how the modified signals contribute to the perceived synthesis quality of the resulting database when compared with baseline unmodified databases.
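The template idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how templates are represented or applied, so everything below is an assumption made for illustration: contours are time-normalized by linear interpolation, normalized by their mean F0, and averaged per stress pattern, and a template is applied by rescaling it to the statement's mean F0. The names `resample`, `learn_templates`, `apply_template`, and `TEMPLATE_POINTS`, the stress-pattern strings, and the toy F0 values are all hypothetical. Unvoiced frames and the actual signal modification step are ignored.

```python
from collections import defaultdict

TEMPLATE_POINTS = 20  # assumed fixed template resolution (hypothetical)


def resample(contour, n):
    """Linearly interpolate a contour to exactly n points."""
    if len(contour) == 1:
        return [contour[0]] * n
    if n == 1:
        return [contour[0]]
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def learn_templates(questions):
    """Average mean-normalized question contours per stress pattern.

    questions: iterable of (stress_pattern, f0_contour_in_hz) pairs.
    Returns {stress_pattern: relative contour of TEMPLATE_POINTS values}.
    """
    grouped = defaultdict(list)
    for pattern, f0 in questions:
        mean = sum(f0) / len(f0)
        # Assumption: store shape relative to the speaker's mean F0.
        shape = resample([v / mean for v in f0], TEMPLATE_POINTS)
        grouped[pattern].append(shape)
    return {
        pattern: [sum(col) / len(col) for col in zip(*shapes)]
        for pattern, shapes in grouped.items()
    }


def apply_template(statement_f0, template):
    """Return a target question contour for a statement.

    Assumption: the template is stretched to the statement's length
    and scaled by the statement's mean F0.
    """
    mean = sum(statement_f0) / len(statement_f0)
    return [mean * r for r in resample(template, len(statement_f0))]


# Toy usage with made-up values: two training questions sharing one
# stress pattern, then one flat statement contour to transform.
templates = learn_templates([
    ("10", [100.0, 100.0, 200.0]),
    ("10", [200.0, 200.0, 400.0]),
])
target = apply_template([120.0] * 5, templates["10"])
```

The resulting `target` would serve only as the desired pitch contour; imposing it on the waveform needs a separate pitch-modification method, which this sketch does not cover.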
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
On building phonetically and prosodically rich speech corpus for text-to-speech synthesis
This paper proposes a way of preparing and recording a speech corpus for unit selection text-to-speech speech synthesis driven by symbolic prosody. The research is focused on a phonetically and prosodically rich sentence selection algorithm. Symbolic description on a deep prosody level is used to enrich the phonetic representation of sentences (by respecting the prosodeme types phones appear in...
Building of a Speech Corpus Optimised for Unit Selection TTS Synthesis
The paper deals with the process of designing a phonetically and prosodically rich speech corpus for unit selection speech synthesis. The attention is given mainly to the recording and verification stage of the process. In order to ensure as high quality and consistency of the recordings as possible, a special recording environment consisting of a recording session management and “pluggable” ch...
Recording and Annotation of Speech Corpus for Czech Unit Selection Speech Synthesis
The paper gives a brief summarisation of preparation and recording of a phonetically and prosodically rich speech corpus for Czech unit selection text-to-speech synthesis. Special attention is paid to the process of two-phase orthographic annotations of recorded sentences with regard to their coherence.
Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis
Prosody is an important factor in the quality of text-to-speech (TTS) synthesis. Typically, acoustic parameters such as f0 and duration are the only variables related to prosody that are used to determine unit selection. Our study explored adding the explicit use of linguistically and perceptually motivated prosodic categories in unit selection-based TTS. One of our goals was to automate the pro...